09. Experiment Sizing - Discussion
Re: Experiment Sizing
Because we have two evaluation metrics of interest, we should choose the
significance level for each test so that the overall Type I error rate stays
at or below .05. Since we would be happy to deploy the new homepage if either
download rate or license purchase rate showed a statistically significant
increase, performing both individual tests at a .05 error rate would push the
chance of at least one false positive above .05. As such, we'll apply the
Bonferroni correction and run each test at a .025 error rate. If instead we
needed to see a statistically significant increase in both metrics, then we
wouldn't need to include the correction on the individual tests.
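The effect of the correction can be checked with a few lines of arithmetic. The
sketch below is illustrative only: the function names are made up for this
example, and the familywise calculation assumes the two tests are independent,
which the download and purchase metrics are unlikely to be exactly. The
Bonferroni bound itself does not rely on independence.

```python
def familywise_error_rate(alpha_per_test, n_tests):
    """Chance of at least one false positive across n_tests tests.

    Assumes the tests are independent (a simplification for illustration).
    """
    return 1 - (1 - alpha_per_test) ** n_tests


def bonferroni_alpha(alpha_overall, n_tests):
    """Per-test significance level under the Bonferroni correction."""
    return alpha_overall / n_tests


# Two metrics, each tested at .05 with no correction.
uncorrected = familywise_error_rate(0.05, 2)

# Two metrics, each tested at the Bonferroni-corrected level.
per_test = bonferroni_alpha(0.05, 2)
corrected = familywise_error_rate(per_test, 2)

print(f"uncorrected overall rate: {uncorrected:.4f}")
print(f"per-test alpha:           {per_test:.3f}")
print(f"corrected overall rate:   {corrected:.4f}")
```

Under the independence assumption, the uncorrected overall rate comes out near
.10, while the corrected rate stays just under .05.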
For an overall 5% Type I error rate with Bonferroni correction and 80% power,
we should require 6 days (rounded up from 5.55) to reliably detect an increase
of 50 downloads per day and 21 days (rounded up from 20.44) to detect an
increase of 10 license purchases per day. 21 days is actually a convenient
number since the three-week timespan helps to account for weekly cycles. In
addition, the 21-day data collection period is a short enough timeframe that
running the experiment is a reasonable proposition. If the required experiment
length were a few weeks longer, then we might have needed to forgo measuring
the license purchasing rate as a critical metric.
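As a rough illustration of how day counts like these can be derived, the sketch
below sizes a one-sided two-proportion z-test and converts the required sample
into days of traffic. Everything named here is an assumption for the example:
the baseline figures (3250 visitors, 520 downloads, and 65 license purchases
per day) are not stated in this section, the helper names are hypothetical, and
the calculation assumes all traffic is split evenly between two groups.

```python
from math import ceil, sqrt
from statistics import NormalDist


def experiment_size(p_null, p_alt, alpha=0.05, power=0.80):
    """Per-group sample size for a one-sided two-proportion z-test."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha)   # critical value under the null
    z_power = z.inv_cdf(power)       # quantile needed to reach the power
    sd_null = sqrt(2 * p_null * (1 - p_null))
    sd_alt = sqrt(p_null * (1 - p_null) + p_alt * (1 - p_alt))
    n = ((z_alpha * sd_null + z_power * sd_alt) / (p_alt - p_null)) ** 2
    return ceil(n)


def days_needed(visitors_per_day, base_count, lift_count, alpha, power=0.80):
    """Days of traffic needed, assuming an even two-group split."""
    p_null = base_count / visitors_per_day
    p_alt = (base_count + lift_count) / visitors_per_day
    n_per_group = experiment_size(p_null, p_alt, alpha, power)
    return 2 * n_per_group / visitors_per_day


# Assumed baselines, NOT taken from this section.
VISITORS_PER_DAY = 3250
DOWNLOADS_PER_DAY = 520
LICENSES_PER_DAY = 65

ALPHA = 0.05 / 2  # Bonferroni-corrected level for two metrics

download_days = days_needed(VISITORS_PER_DAY, DOWNLOADS_PER_DAY, 50, ALPHA)
license_days = days_needed(VISITORS_PER_DAY, LICENSES_PER_DAY, 10, ALPHA)

print(f"downloads: {download_days:.2f} days -> {ceil(download_days)}")
print(f"licenses:  {license_days:.2f} days -> {ceil(license_days)}")
```

Under these assumed baselines, the two results round up to 6 and 21 days, in
line with the figures quoted above. Different baselines or a different traffic
split would change the answer.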
One thing that isn't accounted for in the base experiment length calculations
is that there is going to be a delay between when users download the software
and when they actually purchase a license. That is, when we start the
experiment, there could be about seven days before a user associated with a
cookie actually comes back to make their purchase. Any purchases observed
within the first week might not be attributable to either experimental
condition, since they may come from users who downloaded before the experiment
began. As a way of accounting for this, we'll run the experiment for about one
week longer, or about four weeks in total, to allow those users who come in
during the third week a chance to come back and be counted in the license
purchases tally.
Validity, Bias, and Ethics
QUESTION:
Before you go on to the next part to analyze some simulated data, do you think that there would be any issues with the experiment in terms of validity, bias, or ethical guidelines?
SOLUTION:
This question is left for you to work through on your own.